Mining Data Streams with Skewed Distribution based on Ensemble Method
Abstract
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most of these studies assume relatively balanced and stable data streams and cannot handle skewed distributions (e.g., few positives but many negatives) well, even though such distributions are typical in many data stream applications. In this paper, we propose an ensemble- and cluster-based sampling method to deal with this situation. The study shows that this method is effective for mining skewed data streams.
Related resources
Investigation of linear and non-linear estimation methods in highly-skewed gold distribution
The purpose of this work is to compare the linear and non-linear kriging methods in the mineral resource estimation of the Qolqoleh gold deposit in Saqqez, NW Iran. Considering the fact that the gold distribution is positively skewed and has a significant difference with a normal curve, a geostatistical estimation is complicated in these cases. Linear kriging, as a resource estimation method, c...
A General Framework for Mining Concept-Drifting Data Streams with Skewed Distributions
In recent years, there have been some interesting studies on predictive modeling in data streams. However, most such studies assume relatively balanced and stable data streams but cannot handle well rather skewed (e.g., few positives but lots of negatives) and stochastic distributions, which are typical in many data stream applications. In this paper, we propose a new approach to mine data stre...
Min-wise independent sampling from skewed data streams
Min-wise independent hashing is a powerful sampling technique for estimating the similarity between sets. In particular, it has proved to be ubiquitous for mining data streams of large volume where the input sets are revealed in arbitrary order and the elements in a given set do not arrive consecutively. More precisely, for sets of elements E and attributes A the input is a stream of element-at...
Improved Counter Based Algorithms for Frequent Pairs Mining in Transactional Data Streams
A straightforward approach to frequent pairs mining in transactional streams is to generate all pairs occurring in transactions and apply a frequent items mining algorithm to the resulting stream. The well-known counter based algorithms Frequent and Space-Saving are known to achieve a very good approximation when the frequencies of the items in the stream adhere to a skewed distribution. Motiva...
A Novel Ensemble Approach for Anomaly Detection in Wireless Sensor Networks Using Time-overlapped Sliding Windows
One of the most important issues concerning the sensor data in the Wireless Sensor Networks (WSNs) is the unexpected data which are acquired from the sensors. Today, there are numerous approaches for detecting anomalies in the WSNs, most of which are based on machine learning methods. In this research, we present a heuristic method based on the concept of “ensemble of classifiers” of data minin...
Journal: IJAPUC
Volume 4, Issue
Pages -
Publication date: 2012